Optimal paths for symmetric actions in the unitary group

Jorge Antezana, Gabriel Larotonda and Alejandro Varela 



Abstract 

Given a positive and unitarily invariant Lagrangian $\mathcal{L}$ defined in the algebra of Hermitian matrices, and a fixed interval $[a,b] \subset \mathbb{R}$, we study the action defined in the Lie group of $n \times n$ unitary matrices $\mathcal{U}(n)$ by

\[ S(\alpha) = \int_a^b \mathcal{L}(\dot\alpha(t))\,dt, \]

where $\alpha : [a,b] \to \mathcal{U}(n)$ is a rectifiable curve. We prove that the one-parameter subgroups of $\mathcal{U}(n)$ are the optimal paths, provided the spectrum of the exponent is bounded by $\pi$. Moreover, if $\mathcal{L}$ is strictly convex, we prove that one-parameter subgroups are the unique optimal curves joining given endpoints. Finally, we also study the connection of these results with unitarily invariant metrics in $\mathcal{U}(n)$ as well as angular metrics in the Grassmann manifold.



1 Introduction

The group of $n \times n$ complex unitary matrices $\mathcal{U}(n)$ carries, as any Lie group, a canonical connection without torsion defined on left-invariant vector fields $X, Y$ as $\nabla_X Y = \frac{1}{2}[X,Y]$, whose geodesics are the one-parameter groups $t \mapsto Ue^{tZ}$ (here $U$ is a unitary matrix and $Z$ an anti-Hermitian matrix). We can introduce a Riemannian metric on the unitary group in a standard fashion

\[ \langle X, Y\rangle_g = \mathrm{Tr}(U^*X(U^*Y)^*) = \mathrm{Tr}(XY^*), \]

for $U^*X, U^*Y$ in the Lie algebra of the group, that is, for $U^*X, U^*Y$ anti-Hermitian matrices. It is well-known that the connection just introduced is in fact the Levi-Civita connection of the metric $g$ induced by the trace, and that geodesics are short provided the spectrum of $Z$ is bounded by $\pi$ (see for instance [3]).
Now consider the bi-invariant Finsler metric given by the spectral norm,

\[ \|X\|_U = \|U^*X\| = \|X\| \]

for any $X$ tangent to a unitary matrix $U$. Remarkably, if one keeps the connection but changes the metric, the geodesics of the connection are still short for the induced rectifiable distance (which, as in the Riemannian setting, is computed as the infimum of the lengths of piecewise smooth curves joining given endpoints, where $L(\alpha) = \int \|\dot\alpha\|\,dt$). The same result
was also proved in Q, using techniques of variational calculus, if the Finsler metrics are 
given by the p-Schatten norms for p > 2. This raises a natural question: what do these 
norms have in common that could imply this phenomenon? A possible answer could be 
that all these norms are unitarily invariant, thus they induce bi-invariant metrics on the 
unitary group. One of the main obstacles in dealing with general unitarily invariant norms
is that variational arguments become intractable if the norm is not smooth enough.
In this article we prove that this is the right answer, and introduce a new approach that
considerably simplifies the technicalities. It is based on a beautiful and deep result due to
Thompson on the product of exponential matrices (Theorem 2.1 below).
Our approach also works for more general optimization problems described as follows: fix a bounded interval $[a,b] \subset \mathbb{R}$, and let $S$ be the action defined on piecewise $C^1$ curves $\alpha : [a,b] \to \mathcal{U}(n)$ by

\[ S(\alpha) = \int_a^b \mathcal{L}(\dot\alpha(t))\,dt, \]

where $\mathcal{L}$ is a Lagrangian defined in the algebra of $n \times n$ matrices, with the following unitary invariance property: for every $n \times n$ matrix $A$, and every pair of $n \times n$ unitary matrices $U$ and $V$

\[ \mathcal{L}(UAV) = \mathcal{L}(A). \tag{1} \]

As usual, the Lagrangian is required to be a convex and positive map, and without loss of generality we will assume that $\mathcal{L}(0) = 0$. A Lagrangian that satisfies these properties will be called a symmetric Lagrangian. Two classical examples of symmetric Lagrangians are:

• A unitarily invariant norm $\|\cdot\|_\phi$;

• The kinetic energy $E(A) = \|A\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm.

In the first case, we recover the geometric context mentioned above, because the action $S$ defines the length of $\alpha$ associated to the Finsler structure that considers the norm $\|\cdot\|_\phi$ in each tangent space. Note that in this case, $S$ does not depend on the parametrization of $\alpha$. So, there is no significant difference between the problem of finding a curve that minimizes $S$ among all piecewise $C^1$ curves or among all piecewise $C^1$ curves with a given interval of parameters.

However, in the second example, the action associated to the kinetic energy depends on the parametrization. Let $\alpha : [a,b] \to \mathcal{U}(n)$ be a smooth curve. A simple change of variable shows that, if we take the family of curves $\alpha_r : [ra, rb] \to \mathcal{U}(n)$ defined by $\alpha_r(t) = \alpha(t/r)$, then $r \mapsto S(\alpha_r)$ is a non-increasing function for $r \in (0,+\infty)$. The same phenomenon also holds for any other convex Lagrangian. This suggests that in order to find a minimum we should fix the length of the interval of parameters. This is also suggested by considering the example of the energy functional, where the parameter $t$ should be interpreted as the time parameter.
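
The scaling behind this remark can be checked numerically. The following is only an illustrative sketch, not part of the original argument: it assumes the kinetic energy Lagrangian and the one-parameter curve $\alpha(t) = e^{itZ}$ on $[0,b]$, for which the change of variable gives $S(\alpha_r) = b\|Z\|_F^2/r$. The helper names `expi` and `energy_action` are hypothetical.

```python
import numpy as np

def expi(Z, t=1.0):
    """Return exp(i t Z) for a Hermitian matrix Z, via its spectral decomposition."""
    w, V = np.linalg.eigh(Z)
    return (V * np.exp(1j * t * w)) @ V.conj().T

def energy_action(Z, b, r, steps=400):
    """Midpoint-rule value of S(alpha_r) for the kinetic energy, where
    alpha_r(t) = exp(i (t / r) Z) is defined on [0, r * b]."""
    h = r * b / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        velocity = (1j / r) * (Z @ expi(Z, t / r))  # derivative of alpha_r at time t
        total += np.linalg.norm(velocity, "fro") ** 2 * h
    return total

Z = np.array([[1.0, 0.5], [0.5, -0.3]])  # arbitrary Hermitian example (assumption)
actions = [energy_action(Z, b=1.0, r=r) for r in (0.5, 1.0, 2.0, 4.0)]
```

In this example the list `actions` decreases as $r$ grows, which is the behaviour described above for the energy functional.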

As translations of that interval do not change the value of $S(\alpha)$, without loss of generality we can consider intervals of the form $[0,b]$. So, the optimization problem that we will study is the following:

Problem 1. Given $U, V \in \mathcal{U}(n)$ and $b > 0$, find the piecewise $C^1$ curves $\gamma : [0,b] \to \mathcal{U}(n)$ such that $\gamma(0) = U$, $\gamma(b) = V$ and $\gamma$ minimizes the action given by

\[ S(\alpha) = \int_0^b \mathcal{L}(\dot\alpha(t))\,dt, \]

where $\mathcal{L}$ is a given symmetric Lagrangian.

The second question that arises is whether the minimal paths, when they exist, are unique, or whether they are unique modulo a reparametrization of the path. Thus we will study the following:

Problem 2. Given $U, V \in \mathcal{U}(n)$, $b > 0$, and a minimizing function $\gamma : [0,b] \to \mathcal{U}(n)$ with $\gamma(0) = U$, $\gamma(b) = V$, is this function the unique minimizer of the action for the given endpoints? Is it true that any other minimizing curve with these given endpoints is just a reparametrization of $\gamma$?

2 Preliminaries 

Throughout this paper $\mathcal{M}_n(\mathbb{C})$ denotes the algebra of complex $n \times n$ matrices, $\mathrm{Gl}(n)$ the group of all invertible elements of $\mathcal{M}_n(\mathbb{C})$, $\mathcal{U}(n)$ the group of unitary $n \times n$ matrices, and $\mathcal{H}(n)$ the real subspace of Hermitian matrices. If $T \in \mathcal{M}_n(\mathbb{C})$, then $\|T\|$ stands for the usual spectral norm, $|T|$ indicates the modulus of $T$, i.e. $|T| = \sqrt{T^*T}$, and $\mathrm{tr}(T)$ denotes the trace of $T$. Given $A \in \mathcal{H}(n)$, $\lambda_1(A) \ge \dots \ge \lambda_n(A)$ denotes the eigenvalues of $A$ arranged in non-increasing order, and given an arbitrary matrix $T \in \mathcal{M}_n(\mathbb{C})$, $s_1(T) \ge \dots \ge s_n(T)$ denotes the singular values of $T$, i.e. the eigenvalues of $|T|$. We will use $\lambda(A)$ (resp. $s(T)$) to denote the vector in $\mathbb{R}^n$ consisting of the eigenvalues of $A$ (resp. the singular values of $T$). Finally, given $A, B \in \mathcal{H}(n)$, by $A \le B$ we denote that $A$ is less than or equal to $B$ with respect to the Löwner order.

2.1 Product of exponentials 

We begin this subsection with the following remarkable result: 



Theorem 2.1 (Thompson [TT]). Given $X, Y \in \mathcal{H}(n)$, there exist unitary matrices $U$ and $V$ such that

\[ e^{iX}e^{iY} = e^{i(UXU^* + VYV^*)}. \]

We will use the following corollary of Thompson's theorem:

Corollary 2.2. Let $X, Y, Z \in \mathcal{H}(n)$ be such that $\|Z\| \le \pi$ and $e^{iX}e^{iY} = e^{iZ}$. Then, there are unitary matrices $U$ and $V$ such that $|Z| \le |UXU^* + VYV^*|$.

Proof. By Thompson's Theorem it is enough to prove that, if $X, Y \in \mathcal{H}(n)$, $e^{iX} = e^{iY}$, and $\|X\| \le \pi$, then $|X| \le |Y|$. Let $Y = \sum_n \eta_n\, e_n \otimes e_n$ be a spectral decomposition of $Y$. If $\Lambda = \{n : e^{i\eta_n} = -1\}$, then

\[ |X| = \pi P + \sum_{n \notin \Lambda} |\mu_n|\, e_n \otimes e_n, \]

where $P$ is the spectral projection of $X$ onto the subspace generated by the eigenvectors associated to $\pm\pi$, and the eigenvalues $\mu_n \in (-\pi, \pi)$ satisfy $e^{i\mu_n} = e^{i\eta_n}$ for every $n \notin \Lambda$. Clearly $PY = YP$ and $P|X|P \le P|Y|P$. On the other hand, since $|\mu_n| \le |\eta_n|$ for every $n \notin \Lambda$, we also obtain that $(1-P)|X|(1-P) \le (1-P)|Y|(1-P)$. ■
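
Corollary 2.2 has simple scalar consequences that can be tested numerically. The sketch below is an illustration under stated assumptions: it samples random Hermitian $X, Y$ (the sampling and the helper names are hypothetical), takes the Hermitian $Z$ with spectrum in $(-\pi,\pi]$ such that $e^{iZ} = e^{iX}e^{iY}$, and checks the trace-norm and spectral-norm inequalities that follow from $|Z| \le |UXU^* + VYV^*|$ together with the triangle inequality.

```python
import numpy as np

def random_hermitian(n, rng):
    """Random n x n Hermitian matrix (illustrative sampling, not from the paper)."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def expi(X):
    """exp(iX) for Hermitian X via its spectral decomposition."""
    w, V = np.linalg.eigh(X)
    return (V * np.exp(1j * w)) @ V.conj().T

def log_spectrum_of_product(X, Y):
    """Eigenvalues of the Hermitian Z with spectrum in (-pi, pi] and
    exp(iZ) = exp(iX) exp(iY)."""
    return np.angle(np.linalg.eigvals(expi(X) @ expi(Y)))

rng = np.random.default_rng(0)
gaps = []
for _ in range(200):
    X, Y = random_hermitian(4, rng), random_hermitian(4, rng)
    z = np.abs(log_spectrum_of_product(X, Y))
    x = np.abs(np.linalg.eigvalsh(X))
    y = np.abs(np.linalg.eigvalsh(Y))
    # (trace-norm gap, spectral-norm gap); both should be non-negative
    gaps.append((x.sum() + y.sum() - z.sum(), x.max() + y.max() - z.max()))
```

Both entries of every pair in `gaps` should be non-negative up to rounding; this checks only these two norms, not the full operator inequality.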

Another result due to Thompson is the following triangle inequality for the modulus of 
matrices: 

Theorem 2.3 (Thompson [El [16]). Given $X, Y \in \mathcal{M}_n(\mathbb{C})$, there exist unitaries $V$ and $W$ such that

\[ |X + Y| \le V|X|V^* + W|Y|W^*. \]

Combining this result with Corollary 2.2 we get:

Proposition 2.4. Let $m \ge 2$, and consider $X, X_1, \dots, X_m \in \mathcal{H}(n)$ such that $\|X\| \le \pi$ and

\[ e^{iX} = e^{iX_1} \cdots e^{iX_m}. \]

Then, there exist unitary matrices $U_1, \dots, U_m$ such that

\[ |X| \le \sum_{k=1}^m U_k |X_k| U_k^*. \]

Proof. For $m = 2$ it is a direct consequence of Corollary 2.2 and Theorem 2.3. Suppose that the result is proved for $m = k$. Then, given $X, X_1, \dots, X_{k+1} \in \mathcal{H}(n)$ such that $\|X\| \le \pi$ and $e^{iX} = e^{iX_1} \cdots e^{iX_{k+1}}$, let $Y \in \mathcal{H}(n)$ be such that $\|Y\| \le \pi$ and

\[ e^{iY} = e^{iX_2} \cdots e^{iX_{k+1}}. \]

By the inductive hypothesis, there exist unitary matrices $V_2, \dots, V_{k+1}$ such that

\[ |Y| \le \sum_{j=2}^{k+1} V_j |X_j| V_j^*. \]

On the other hand, since $e^{iX} = e^{iX_1}e^{iY}$, by the case $m = 2$ already proved, there are unitary matrices $U_1$ and $U$ such that $|X| \le U_1|X_1|U_1^* + U|Y|U^*$. If we define $U_j = UV_j$ for $j \ge 2$, then we get the desired result. ■

2.2 The Lagrangians 

Let us list in the following proposition several properties of the symmetric Lagrangian 
that will be used in the sequel: 

Proposition 2.5. Let $\mathcal{L} : \mathcal{M}_n(\mathbb{C}) \to [0,\infty)$ be a symmetric Lagrangian, i.e. convex, $\mathcal{L}(0) = 0$, and unitarily invariant in the sense of equation (1). Then

(P1) $\mathcal{L}$ is continuous,

(P2) $\mathcal{L}(tA) \le t\mathcal{L}(A)$ for every $t \in [0,1]$,

(P3) $\mathcal{L}(A) \le \mathcal{L}(B)$ provided $0 \le A \le B$,

(P4) There exists $\phi : \mathbb{R}^n_+ \to [0,+\infty)$ such that $\mathcal{L}(A) = \phi(s(A))$. This $\phi$ is invariant under rearrangement, positive, convex, with $\phi(0) = 0$ and $\phi(x) \le \phi(y)$ if $x, y \in \mathbb{R}^n_+$ and $x_i \le y_i$ for $i = 1, \dots, n$.

Proof. The first property is clear because every convex function in a finite dimensional vector space is continuous. Also (P2) is a consequence of the convexity and the fact that $\mathcal{L}(0) = 0$. As $\mathcal{L}$ is unitarily invariant, the singular value decomposition implies that $\mathcal{L}(A)$ only depends on the singular values of $A$. Hence, if $x \in \mathbb{R}^n_+$ and $\mathrm{diag}(x)$ denotes the $n \times n$ diagonal matrix whose diagonal entries correspond to the coordinates of $x$, we can define $\phi(x) = \mathcal{L}(\mathrm{diag}(x))$; clearly $\phi(0) = 0$, it is non-negative and convex. Convexity implies that if $x, y \in \mathbb{R}^n_+$ and $x_i \le y_i$ for $i = 1, \dots, n$, then $\phi(x) \le \phi(y)$. This proves (P4), and (P3) is a direct consequence of it. ■

Remark 2.6. Let $\phi : \mathbb{R}^n_+ \to [0,+\infty)$ be a rearrangement invariant, positive and convex function, with $\phi(0) = 0$. Then $\phi$ gives rise to a symmetric Lagrangian $\mathcal{L}_\phi$ via the equation $\mathcal{L}_\phi(A) = \phi(s(A))$. Note that the natural extension of $\phi$ to $\mathbb{R}^n$ is strongly Schur convex, but not necessarily subadditive.
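
As a concrete instance of this construction, the sketch below builds a Lagrangian from the singular values and tests the invariance property (1) and property (P2) on a random matrix. The choice $\phi(x) = \sum_i x_i^p$ with $p = 3$ is an assumption made for illustration (it gives the $p$-th power of the Schatten $p$-norm), and the helper names are hypothetical.

```python
import numpy as np

def lagrangian(A, p=3.0):
    """L(A) = phi(s(A)) with phi(x) = sum_i x_i**p, an assumed example of a
    rearrangement invariant convex function with phi(0) = 0 (p >= 1)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** p))

def random_unitary(n, rng):
    """Unitary factor of the QR decomposition of a complex Gaussian matrix."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(G)
    return Q

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, V = random_unitary(3, rng), random_unitary(3, rng)

# Equation (1): L(UAV) should agree with L(A) up to rounding.
invariance_gap = abs(lagrangian(U @ A @ V) - lagrangian(A))
# Property (P2): L(tA) <= t L(A) on a grid of t in [0, 1].
p2_holds = all(lagrangian(t * A) <= t * lagrangian(A) + 1e-12
               for t in np.linspace(0.0, 1.0, 11))
```

This Lagrangian is not a norm for $p > 1$, so its action depends on the parametrization, as in the kinetic energy example of the Introduction.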

3 Optimality of one parameter subgroups 

A geodesic segment is a curve $t \mapsto Ue^{itZ}$ for $Z \in \mathcal{H}(n)$ and $U \in \mathcal{U}(n)$. In this section we prove that the geodesic segments (which are parametrized with constant velocity) are optimal for Problem 1. Moreover, if $\mathcal{L}$ is strictly convex, then we will prove that these geodesic segments are the unique optimal paths.



3.1 Geodesic segments are short 

Definition 3.1. A polygonal path is a broken geodesic, that is, a curve $P : [0,b] \to \mathcal{U}(n)$ such that there is a partition of the interval $[0,b]$ given by the points $0 = t_0 < \dots < t_k = b$, Hermitian matrices $X_1, \dots, X_k$ with norm less than or equal to $\pi$, and $U \in \mathcal{U}(n)$ so that

\[ P(t) = \begin{cases} U e^{i\frac{t}{t_1}X_1} & \text{if } t \in [0, t_1], \\ U e^{iX_1} \cdots e^{iX_{j-1}}\, e^{i\frac{t - t_{j-1}}{t_j - t_{j-1}}X_j} & \text{if } t \in [t_{j-1}, t_j] \ (j > 1). \end{cases} \tag{3} \]

Our first step toward the proof of the optimality of the geodesic segments with constant velocity is the following proposition, which proves that segments are better than polygonal paths.

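Proposition 3.2. Let $U \in \mathcal{U}(n)$ and $V = Ue^{iZ}$, with $Z \in \mathcal{H}(n)$ and $\|Z\| \le \pi$. Let $\gamma : [0,b] \to \mathcal{U}(n)$ be the segment $\gamma(t) = Ue^{itZ/b}$, and $P : [0,b] \to \mathcal{U}(n)$ a polygonal path joining $U$ to $V$. Then $S(P) \ge S(\gamma)$.

Proof. Let $0 = t_0 < \dots < t_k = b$, and $X_1, \dots, X_k \in \mathcal{H}(n)$ with norm less than or equal to $\pi$, so that $P$ has the form shown in (3). Then

\[ S(P) = \sum_{j=1}^k \int_{t_{j-1}}^{t_j} \mathcal{L}(\dot P(t))\,dt = \sum_{j=1}^k \int_{t_{j-1}}^{t_j} \mathcal{L}\Big(\frac{X_j}{t_j - t_{j-1}}\Big)\,dt = \sum_{j=1}^k (t_j - t_{j-1})\,\mathcal{L}\Big(\frac{X_j}{t_j - t_{j-1}}\Big). \tag{4} \]

On the other hand, since $e^{iZ} = e^{iX_1} \cdots e^{iX_k}$ and $\|Z\| \le \pi$, by Proposition 2.4 there exist unitary matrices $U_1, \dots, U_k$ such that

\[ |Z| \le \sum_{j=1}^k U_j |X_j| U_j^*. \tag{5} \]

Then, joining (4) and (5), and using the properties of $\mathcal{L}$ (unitary invariance, convexity and (P3)) we obtain

\[ S(P) = b \sum_{j=1}^k \frac{t_j - t_{j-1}}{b}\,\mathcal{L}\Big(\frac{U_j|X_j|U_j^*}{t_j - t_{j-1}}\Big) \ge b\,\mathcal{L}\Big(\frac{1}{b}\sum_{j=1}^k U_j|X_j|U_j^*\Big) \ge b\,\mathcal{L}\Big(\frac{Z}{b}\Big) = S(\gamma). \]

■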



To prove that geodesic segments are optimal paths among all the possible piecewise $C^1$ curves, we need the following standard approximation result by polygonal paths.

Lemma 3.3. Let $\alpha : [0,b] \to \mathcal{U}(n)$ be piecewise smooth. Then for any $\varepsilon > 0$ there is a polygonal path $P_\varepsilon : [0,b] \to \mathcal{U}(n)$ such that for any $t \in [0,b]$,

\[ \|P_\varepsilon^*(t)\dot P_\varepsilon(t) - \alpha^*(t)\dot\alpha(t)\| < \varepsilon. \]

Proof. We may as well assume that $\alpha$ is smooth in $[0,b]$. Recall that $\alpha$, $\dot\alpha$ are continuous in the uniform norm. Let $\varepsilon > 0$, and choose a partition $0 = t_0 < t_1 < \dots < t_n = b$ of the interval $[0,b]$ such that, for any $k = 0, 1, \dots, n-1$,

\[ \|\alpha(t) - \alpha(s)\| < 2 \quad \text{and} \quad \|\alpha^*(t)\dot\alpha(t) - \alpha^*(s)\dot\alpha(s)\| < \frac{\varepsilon}{2} \]

if $s, t \in [t_k, t_{k+1}]$. The first condition implies that there exist $Z_k \in \mathcal{H}(n)$ such that $\|Z_k\| < \pi$ and $e^{iZ_k} = \alpha^*(t_k)\alpha(t_{k+1})$. Moreover, if $\log$ denotes the principal branch of the logarithm, then

\[ iZ_k = \log(\alpha^*(t_k)\alpha(t_{k+1})). \]

Now note that, for any fixed $t \in [0,b]$, the map $g : h \mapsto \frac{1}{h}\log(\alpha^*(t)\alpha(t+h))$ is well-defined and analytic, for sufficiently small $h$. Moreover

\[ g(h) \xrightarrow[h \to 0]{} \frac{d}{ds}\log(\alpha^*(t)\alpha(t+s))\Big|_{s=0} = \alpha^*(t)\dot\alpha(t). \]

Then, taking a refinement of the partition if necessary, we can also assume that

\[ \Big\|\frac{iZ_k}{t_{k+1} - t_k} - \alpha^*(t_k)\dot\alpha(t_k)\Big\| < \frac{\varepsilon}{2} \]

for any $k = 0, 1, \dots, n-1$. Consider the map $P_\varepsilon : [0,b] \to \mathcal{U}(n)$ which is defined as

\[ P_\varepsilon(t) = \alpha(t_k)\, e^{i\frac{t - t_k}{t_{k+1} - t_k}Z_k} \quad \text{for } t \in [t_k, t_{k+1}]. \]

Then $P_\varepsilon$ is certainly a polygonal path, and it is straightforward to see that it verifies the claim of the lemma. ■

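Theorem 3.4. Let $U \in \mathcal{U}(n)$ and $V = Ue^{iZ}$, with $Z \in \mathcal{H}(n)$ and $\|Z\| \le \pi$. Then, the curve $\gamma(t) = Ue^{itZ/b}$ is optimal among piecewise smooth curves $\alpha : [0,b] \to \mathcal{U}(n)$ joining $U$ to $V$, with respect to the action $S$ defined by a symmetric Lagrangian, and in particular $\min S = b\,\mathcal{L}(Z/b)$.

Proof. Given $\varepsilon > 0$, let $\delta > 0$ be such that $\|X - Y\| < \delta$ implies that $|\mathcal{L}(X) - \mathcal{L}(Y)| < \varepsilon/b$ for every $X$ and $Y$ in a ball big enough. Then, let $P_\delta$ be a polygonal path in $\mathcal{U}(n)$ as in the previous lemma, joining $U$ to $V$, such that

\[ \|\alpha^*\dot\alpha - P_\delta^*\dot P_\delta\| < \delta. \]

Then by Proposition 3.2 and the unitary invariance of $\mathcal{L}$,

\[ S(\gamma) \le S(P_\delta) = \int_0^b \mathcal{L}(\dot P_\delta(t))\,dt \le \varepsilon + \int_0^b \mathcal{L}(\dot\alpha(t))\,dt = \varepsilon + S(\alpha). \]

Therefore, $S(\gamma) \le S(\alpha)$. ■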
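
The comparison between a geodesic segment and a polygonal path can be illustrated numerically. The sketch below is an assumption-laden example rather than the authors' method: it fixes the kinetic energy Lagrangian, a two-piece polygonal path with break time `t0`, randomly sampled Hermitian exponents, and hypothetical helper names. It evaluates $S(\gamma) = b\,\mathcal{L}(Z/b)$ and the value of $S(P)$ given by equation (4).

```python
import numpy as np

def expi(X):
    """exp(iX) for Hermitian X via its spectral decomposition."""
    w, V = np.linalg.eigh(X)
    return (V * np.exp(1j * w)) @ V.conj().T

def energy(singular_values):
    """Kinetic energy as a function of singular values (squared Frobenius norm)."""
    return float(np.sum(np.asarray(singular_values) ** 2))

def geodesic_vs_polygonal(X1, X2, t0, b):
    """Return (S(gamma), S(P)) for the two-piece polygonal path with exponents
    X1, X2 broken at time t0, and the geodesic gamma joining its endpoints."""
    # |eigenvalues| of the Hermitian Z with exp(iZ) = exp(iX1) exp(iX2), ||Z|| <= pi
    z = np.abs(np.angle(np.linalg.eigvals(expi(X1) @ expi(X2))))
    x1 = np.abs(np.linalg.eigvalsh(X1))
    x2 = np.abs(np.linalg.eigvalsh(X2))
    s_gamma = b * energy(z / b)
    s_poly = t0 * energy(x1 / t0) + (b - t0) * energy(x2 / (b - t0))
    return s_gamma, s_poly

rng = np.random.default_rng(2)
results = []
for _ in range(100):
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    X1, X2 = (A + A.conj().T) / 4, (B + B.conj().T) / 4  # random Hermitian exponents
    results.append(geodesic_vs_polygonal(X1, X2, t0=0.4, b=1.0))
```

In every sampled pair the first entry should not exceed the second, in agreement with Proposition 3.2; equality would require the polygonal path to coincide with the geodesic.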

Remark 3.5. If $\alpha : [0,b] \to \mathcal{U}(n)$ is just rectifiable (that is, differentiable almost everywhere with $\dot\alpha(t)$ bounded), the approximation by a polygonal path can be carried out with no major changes, and the proof of the previous theorem shows that in fact, geodesic segments are optimal among rectifiable arcs joining given endpoints.

3.2 Uniqueness of short paths 

Concerning uniqueness, it is clear that the convexity condition of $\mathcal{L}$ should be strengthened. Let us agree to call $\mathcal{L}$ nondegenerate if, given $A, B \in \mathcal{H}(n)$, the existence of $\lambda \in (0,1)$ such that the inequality of the convexity condition turns into an equality, implies that there exists $s > 0$ such that $A = sB$. In other words, if

\[ \mathcal{L}(\lambda A + (1-\lambda)B) = \lambda\mathcal{L}(A) + (1-\lambda)\mathcal{L}(B) \]

for some $\lambda \in (0,1)$, then $A = sB$ for some $s > 0$. This is a notion of nondegeneracy outside lines.

The other notion at play here is the strongest notion of strict convexity of $\mathcal{L}$, which of course means that if the equality above holds for some $\lambda \in (0,1)$, then $A = B$. A simple example of a strictly convex Lagrangian is the energy functional, given by the square of the Frobenius norm on $\mathcal{H}(n)$.

Remark 3.6. Note that strict convexity implies nondegeneracy, but the notion of nondegeneracy is relevant since no linear space norm can be strictly convex. In fact, it is usual to say that a norm $\|\cdot\|$ on a linear space is strictly convex when the weaker condition (nondegeneracy) stated above holds, which due to the homogeneity of the norm amounts to saying that

\[ \|A + B\| = \|A\| + \|B\| \]

implies $A = sB$ for some $s > 0$, and geometrically, is equivalent to the fact that the unit sphere of the normed space contains no segments.

We begin with a technical lemma. Recall that if $A \in \mathcal{H}(n)$, then $\lambda_1(A), \dots, \lambda_n(A)$ denotes the eigenvalues of $A$ arranged in non-increasing order.

Lemma 3.7. Let $X, Y, Z \in \mathcal{H}(n)$ be such that $e^{iZ} = e^{iX}e^{iY}$ and $\|Z\| < \pi$. If $\lambda_k(X) = r\lambda_k(Z)$ and $\lambda_k(Y) = (1-r)\lambda_k(Z)$ for some $r \in [0,1]$ and every $k \in \{1, \dots, n\}$, then $X = rZ$ and $Y = (1-r)Z$.

Proof. It is enough to show that $Z$ shares an orthonormal basis of eigenvectors with $X$ and $Y$. Let $\xi$ be a unit eigenvector of $Z$ such that $|Z|\xi = \|Z\|\xi$. Consider the unit sphere $S^{n-1} \subset \mathbb{C}^n$ and the maps $\alpha, \beta : [0,1] \to S^{n-1}$ given by $\alpha(t) = e^{itZ}\xi$,

\[ \beta(t) = \begin{cases} e^{2itX}\xi & \text{if } t \in [0, 1/2], \\ e^{iX}e^{2i(t-1/2)Y}\xi & \text{if } t \in [1/2, 1]. \end{cases} \]

In particular, $\alpha$ and $\beta$ have the same endpoints. A simple computation shows that, with respect to the natural Riemannian structure, $\mathrm{Long}(\alpha) = \mu$ and $\mathrm{Long}(\beta) \le \mu$. But, since

\[ \ddot\alpha(t) = e^{itZ}(-Z^2)\xi = -e^{itZ}|Z|^2\xi = -\|Z\|^2 e^{itZ}\xi = -\|Z\|^2\alpha(t) \]

and $\mathrm{Long}(\alpha) = \|Z\| < \pi$, then $\alpha$ is the unique short geodesic of the sphere $S^{n-1}$ joining $\xi$ with $e^{iZ}\xi$. So, $\mathrm{Graph}(\alpha) = \mathrm{Graph}(\beta)$ and $\xi$ is also an eigenvector of $X$ and $Y$. Iterating this procedure, we can conclude that $X$, $Y$ and $Z$ share a common orthonormal basis of eigenvectors. ■

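Theorem 3.8. Assume that $\mathcal{L}$ is strictly convex. Let $X, Y \in \mathcal{H}(n)$ with norm less than or equal to $\pi$, and $Z \in \mathcal{H}(n)$ such that $\|Z\| < \pi$ and $e^{iZ} = e^{iX}e^{iY}$. Consider the geodesic segment $\gamma : [0,b] \to \mathcal{U}(n)$ defined by $\gamma(t) = e^{itZ/b}$, and the polygonal $P : [0,b] \to \mathcal{U}(n)$ defined by

\[ P(t) = \begin{cases} e^{i\frac{t}{t_0}X} & \text{if } t \in [0, t_0], \\ e^{iX}e^{i\frac{t - t_0}{b - t_0}Y} & \text{if } t \in [t_0, b], \end{cases} \]

for some $t_0 \in (0,b)$. If $S(P) = S(\gamma)$ then $X = \frac{t_0}{b}Z$ and $P = \gamma$.

Proof. By Theorem 2.1 and Corollary 2.2 there exist unitary matrices $U$ and $V$ such that

\[ e^{iZ} = e^{i(UXU^* + VYV^*)} \quad \text{and} \quad |Z| \le |UXU^* + VYV^*|, \]

and by the computations made in Proposition 3.2 (Equation (4))

\[ S(P) = t_0\,\mathcal{L}\Big(\frac{X}{t_0}\Big) + (b - t_0)\,\mathcal{L}\Big(\frac{Y}{b - t_0}\Big). \]

Then, using the properties of $\mathcal{L}$, the hypothesis $S(P) = S(\gamma)$ implies that

\[ S(\gamma) = S(P) = t_0\,\mathcal{L}\Big(\frac{X}{t_0}\Big) + (b - t_0)\,\mathcal{L}\Big(\frac{Y}{b - t_0}\Big) \ge b\,\mathcal{L}\Big(\frac{UXU^* + VYV^*}{b}\Big) \ge b\,\mathcal{L}\Big(\frac{Z}{b}\Big) = S(\gamma). \]

On one hand, this implies that $Z = UXU^* + VYV^*$. Indeed, if $W = UXU^* + VYV^*$ then $|Z| \le |W|$. But the above chain of identities implies that $\mathcal{L}(Z) = \mathcal{L}(W)$, and (P2) in Proposition 2.5 implies that $|Z| = |W|$. Hence, $0 \le |Z| = |W| < \pi$. Since $e^{iZ} = e^{i(UXU^* + VYV^*)}$ we get the desired equality. On the other hand, since $\mathcal{L}$ is strictly convex, if $r = t_0/b$ then

\[ rZ = UXU^* \quad \text{and} \quad (1-r)Z = VYV^*. \]

Now, by Lemma 3.7 we obtain that $X = UXU^*$ and $Y = VYV^*$, which concludes the proof. ■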

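Theorem 3.9. Assume that $\mathcal{L}$ is strictly convex. Let $Z \in \mathcal{H}(n)$ be such that $\|Z\| < \pi$. Then, the geodesic segment $\gamma : [0,b] \to \mathcal{U}(n)$ defined by $\gamma(t) = Ue^{itZ/b}$ is the unique optimal piecewise $C^1$ curve in $\mathcal{U}(n)$ joining $U$ to $V = Ue^{iZ}$, and $S(\gamma) = b\,\mathcal{L}(Z/b)$.

Proof. Without loss of generality we can assume that $U = 1$. Suppose that $\alpha$ is any short, piecewise smooth curve joining $1$ to $e^{iZ}$. Let $t_0 \in (0,b)$ and let $\alpha(t_0) = e^{iX} = e^{iZ}e^{-iY}$, with $\|Y\| \le \pi$, $\|X\| \le \pi$. Consider the polygonal $P : [0,b] \to \mathcal{U}(n)$ defined by

\[ P(t) = \begin{cases} e^{i\frac{t}{t_0}X} & \text{if } t \in [0, t_0], \\ e^{iX}e^{i\frac{t - t_0}{b - t_0}Y} & \text{if } t \in [t_0, b]. \end{cases} \]

Then, by Proposition 3.2 and Theorem 3.4 applied to each segment,

\[ S(\gamma) \le S(P) \le \int_0^{t_0} \mathcal{L}(\dot\alpha)\,dt + \int_{t_0}^b \mathcal{L}(\dot\alpha)\,dt = S(\alpha) = S(\gamma). \]

Hence $S(\gamma) = S(P)$, and by Theorem 3.8 we get that $X = \frac{t_0}{b}Z$, that is, $\alpha(t_0) = \gamma(t_0)$. Since $t_0$ was arbitrary, $\alpha = \gamma$. ■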

This settles Problem 2 when the Lagrangian is strictly convex: the geodesic segments are optimal and unique as functions. Regarding the second question of that problem, we have the following result, which settles this problem when the Lagrangian is nondegenerate (for instance, if $\mathcal{L}$ is a strictly convex norm on a linear space, see Remark 3.6): in this case, geodesic segments are optimal and unique modulo a reparametrization of the path, that is, they are unique in a geometrical sense.

Theorem 3.10. Assume that $\mathcal{L}$ is nondegenerate. Let $Z \in \mathcal{H}(n)$ be such that $\|Z\| < \pi$. Then, if $\alpha : [0,b] \to \mathcal{U}(n)$ is an optimal path of the minimization problem given by $\mathcal{L}$ with given endpoints $U$, $V = Ue^{iZ}$, $\alpha$ must be a reparametrization of the geodesic segment $\gamma : [0,b] \to \mathcal{U}(n)$ defined by $\gamma(t) = Ue^{itZ/b}$.

Proof. We assume that $U = 1$ and $V = e^{iZ}$. Let $t_0 \in (0,b)$ and let $\alpha(t_0) = e^{iX} = e^{iZ}e^{-iY}$, with $\|Y\| \le \pi$, $\|X\| \le \pi$. Arguing as in the proof of Theorem 3.8, convexity of $\mathcal{L}$ and minimality of $\alpha$ imply that $Z = UXU^* + VYV^*$. Now, nondegeneracy of $\mathcal{L}$ implies also that there exists $s > 0$ such that

\[ \frac{UXU^*}{t_0} = s\,\frac{VYV^*}{b - t_0}. \]

Now we take $s_0 = \frac{b - t_0}{s\,t_0} > 0$ and $r = (1 + s_0)^{-1}$. Note that $r \in [0,1]$ and also that $rZ = UXU^*$, $(1-r)Z = VYV^*$. Invoking once again Lemma 3.7, it follows that $X = UXU^*$, $Y = VYV^*$. Thus $\alpha(t_0) = e^{iX} = e^{irZ}$ and then $\alpha$ must be a reparametrization of the geodesic segment $\gamma$. ■

Regarding uniqueness of paths when $\|U - V\| = 2$ (or equivalently, when $V = Ue^{iZ}$ and $\|Z\| = \pi$), this property is not expected since taking $n = 1$, $U = 1$, $V = -1$ shows that there are two geodesic segments in the circle ($= \mathcal{U}(1)$) joining $U$, $V$, and the situation worsens as $n$ gets bigger.

4 Rectifiable distances in $\mathcal{U}(n)$ and angular metrics in the
Grassmann manifold

In this section, we focus on the particular case where $\mathcal{L}$ is a unitarily invariant norm. In that case the action $S$ defines a length of curves and the length of the optimal path defines a distance in $\mathcal{U}(n)$.

4.1 Unitarily invariant norms and symmetric gauge functions 

One of the most relevant properties of the uniform norm of matrices is the following: given two unitary matrices $U$ and $V$, then $\|UTV\| = \|T\|$. This property is shared by many other norms defined in $\mathcal{M}_n(\mathbb{C})$.

Definition 4.1. A norm $|||\cdot|||$ defined in $\mathcal{M}_n(\mathbb{C})$ is called unitarily invariant if for every matrix $T$ and every pair of unitary matrices $U$ and $V$ it holds that $|||UTV||| = |||T|||$.

As a consequence of the singular value decomposition, $|||T||| = |||\,|T|\,|||$, and

\[ |||T||| = \|T\|_\phi = \phi(s(T)), \tag{6} \]

where $\phi$ is a symmetric gauge function, that is, a rearrangement invariant norm on $\mathbb{R}^n$ that depends only on the moduli of the coordinates of the vectors. The next theorem [5] will be useful in what follows:

Theorem 4.2. There is a bijection between symmetric gauge functions $\phi$ on $\mathbb{R}^n$, and unitarily invariant norms $\|\cdot\|_\phi$ on $\mathcal{M}_n(\mathbb{C})$ given by equation (6) above.

4.2 Rectifiable metrics in the unitary group 

By considering as a Lagrangian a unitarily invariant norm $\|\cdot\|_\phi$, the action $S$ can be interpreted as the length of curves $L_\phi$, and the rectifiable distance between $U, V \in \mathcal{U}(n)$ is

\[ d_\phi(U, V) = \inf\{L_\phi(\gamma) : \gamma : [a,b] \to \mathcal{U}(n) \text{ is piecewise smooth and joins } U \text{ to } V \text{ in } \mathcal{U}(n)\}. \]

The function $d_\phi$ is in fact a distance, since $\|U - V\|_\phi \le d_\phi(U, V)$ for any $U, V \in \mathcal{U}(n)$. One of the main features of this metric is that it is invariant for the action of the unitary group $\mathcal{U}(n)$; in fact it is a bi-invariant metric:

\[ d_\phi(UV_1W, UV_2W) = d_\phi(V_1, V_2) \]

for $U, W, V_1, V_2 \in \mathcal{U}(n)$.

4.2.1 Minimality of one-parameter subgroups 

As a direct consequence of Theorem 3.4 and Theorem 3.10 we obtain the following result, which generalizes [H Theorem 3.2] for the $p$-norms ($p > 2$), see also [11].

Theorem 4.3. Let $U, V \in \mathcal{U}(n)$ and $V = Ue^{iZ}$, with $\|Z\| \le \pi$, $Z \in \mathcal{H}(n)$. Then, the curve $\delta(t) = Ue^{itZ}$ is shorter than any other piecewise smooth curve $\gamma$ in $\mathcal{U}(n)$ joining $U$ to $V$, when we measure them with the norm $\|\cdot\|_\phi$. In particular, $d_\phi(U, V) = \|Z\|_\phi$. If $\|U - V\| < 2$ (equivalently, if $\|Z\| < \pi$), then this $\delta$ is the unique short path joining $U, V$ in $\mathcal{U}(n)$ provided the norm is strictly convex.
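
The formula $d_\phi(U,V) = \|Z\|_\phi$ makes the distance directly computable from the spectrum of $U^*V$. The sketch below is a minimal illustration, assuming the Schatten $p$-norms as the unitarily invariant norms and using hypothetical helper names; it also checks bi-invariance and the triangle inequality on random unitaries.

```python
import numpy as np

def random_unitary(n, rng):
    """Unitary factor of the QR decomposition of a complex Gaussian matrix."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(G)
    return Q

def unitary_distance(U, V, p=2.0):
    """d_phi(U, V) = ||Z||_phi for the Schatten p-norm, where Z is Hermitian,
    ||Z|| <= pi and V = U exp(iZ)."""
    theta = np.angle(np.linalg.eigvals(U.conj().T @ V))  # spectrum of Z
    if np.isinf(p):
        return float(np.max(np.abs(theta)))
    return float(np.sum(np.abs(theta) ** p) ** (1.0 / p))

rng = np.random.default_rng(3)
U, V, W, W2 = (random_unitary(4, rng) for _ in range(4))
d_uv = unitary_distance(U, V)
d_shifted = unitary_distance(W @ U @ W2, W @ V @ W2)    # bi-invariance
triangle_gap = unitary_distance(U, W) + unitary_distance(W, V) - d_uv
```

Here `d_shifted` should agree with `d_uv` up to rounding, and `triangle_gap` should be non-negative.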

Remark 4.4. A question related to the uniqueness of geodesics is whether we can ensure that the points in $\mathcal{U}(n)$ are aligned when the distance is additive. That is, whether

\[ d_\phi(U, V) = d_\phi(U, W) + d_\phi(W, V) \]

implies that there exist $t_0 \in [0,1]$ and $X_0 \in \mathcal{H}(n)$ with $\|X_0\| \le \pi$ such that

\[ V = Ue^{iX_0}, \quad \text{while} \quad W = Ue^{it_0X_0}. \]

The previous theorem implies this when $\|U - V\| < 2$. However, the question always has an affirmative answer (provided the norm is strictly convex), with a simpler proof.

Theorem 4.5. Assume that the norm $\|\cdot\|_\phi$ is strictly convex, and let $U, V, W \in \mathcal{U}(n)$ be such that

\[ d_\phi(U, V) = d_\phi(U, W) + d_\phi(W, V). \]

Then $U$, $V$, $W$ are aligned in $\mathcal{U}(n)$.

Proof. We can assume that $U = 1$, $V = e^{iZ}$, $W = e^{iX}$ with $X$, $Z$ of norm less than or equal to $\pi$. Let $Y \in \mathcal{H}(n)$ be such that $\|Y\| \le \pi$ and $e^{iZ} = e^{iX}e^{iY}$. Then the hypothesis is that

\[ \|Z\|_\phi = \|X\|_\phi + \|Y\|_\phi. \]

Consider the smooth path $\alpha(t) = e^{itX}e^{itY}$. Then $\alpha$ joins the same endpoints as $\delta(t) = e^{itZ}$ in $\mathcal{U}(n)$, thus

\[ \|X + Y\|_\phi = L_\phi(\alpha) \ge L_\phi(\delta) = \|Z\|_\phi = \|X\|_\phi + \|Y\|_\phi. \]

Since the norm is strictly convex, there exists $\lambda > 0$ such that $Y = \lambda X$. Pick $X_0 = (1 + \lambda)X$ and $t_0 = (1 + \lambda)^{-1}$ to finish the proof. ■

4.3 The Grassmannian 

The Grassmannian $\mathcal{G}_n$ is the set of subspaces of $\mathbb{C}^n$, which can be identified with the set of orthogonal projections in $\mathcal{M}_n(\mathbb{C})$. If we consider in $\mathcal{M}_n(\mathbb{C})$ the topology defined by any of the equivalent norms, the Grassmann space endowed with the inherited topology becomes a compact set. However, it is not connected. Indeed, it is enough to consider the trace $\mathrm{tr}$, which is a continuous map defined on the whole space $\mathcal{M}_n(\mathbb{C})$, and restricted to $\mathcal{G}_n$ takes only positive integer values. In particular, this shows that the connected components of $\mathcal{G}_n$ are the subsets $\mathcal{G}_{m,n}$ defined as:

\[ \mathcal{G}_{m,n} := \{P \in \mathcal{G}_n : \mathrm{tr}(P) = m\}. \]

Each of these components is a submanifold of $\mathcal{M}_n(\mathbb{C})$ [IS, p. 129], and connected components are given by the unitary orbit of a given projection $P$ such that $\mathrm{tr}(P) = m$:

\[ \mathcal{G}_{m,n} = \{UPU^* : U \in \mathcal{U}(n)\}. \]

The tangent space at a point $P \in \mathcal{G}_{m,n}$ can be identified with the subspace of $P$-codiagonal Hermitian matrices, i.e.

\[ T_P\mathcal{G}_n = \{X \in \mathcal{H}(n) : X = PX + XP\}. \]

In particular note that $T_P\mathcal{G}_n$ has a natural complement $N_P$, which is the space of Hermitian matrices that commute with $P$, that is, the $P$-diagonal Hermitian matrices. The decomposition in diagonal and codiagonal matrices defines a normal bundle, and leads to a covariant derivative



\[ \nabla_V\Gamma(P) = \Pi_{T_P\mathcal{G}_n}\Big(\frac{d}{dt}\Gamma(\alpha(t))\Big|_{t=0}\Big) \tag{7} \]

where $\Gamma$ is a vector field along the curve $\alpha : (-\varepsilon, \varepsilon) \to \mathcal{G}_{m,n}$ that satisfies $\alpha(0) = P$ and $\dot\alpha(0) = V$. So, we have a notion of parallelism, and the geodesics in this sense are described by the following theorem:

Theorem 4.6 (Porta-Recht [13]). The unique geodesic at $P$ with direction $X$ is:

As the unitary group acts transitively in these components via $U \cdot P = UPU^*$, they are also homogeneous spaces of $\mathcal{U}(n)$. They can be distinguished from other homogeneous submanifolds of $\mathcal{U}(n)$, because the map

\[ P \mapsto S_P = 2P - 1 \]

embeds them in $\mathcal{U}(n)$, and the map $S$ is two times an isometry. The images $S_P$ are symmetries, i.e. matrices that satisfy $S_P = S_P^* = S_P^{-1}$.

4.3.1 Finsler metrics on the Grassmannian 

For a given symmetric norm, the Grassmann space carries the Finsler structure given by

\[ \|X\|_P = \|X\|_\phi \]

for $X \in T_P\mathcal{G}_n$, and with this structure, the Grassmann component $\{UPU^* : U \in \mathcal{U}(n)\}$ is isometric (modulo a factor 2) to the orbit of symmetries $\{US_PU^* : U \in \mathcal{U}(n)\}$. In the particular case when $\|\cdot\|_\phi$ is the Frobenius norm, this connection is the Levi-Civita connection of the metric, since the $P$-diagonal matrices are the orthogonal complement of the $P$-codiagonal matrices with respect to this Riemannian metric.

A straightforward computation shows that, if $X = XP + PX$, then $e^{iX}S_P = S_Pe^{-iX}$. This simple observation enables us to use our results in the unitary group to prove minimality of geodesics in the Grassmann manifold:

Theorem 4.7. If $P, Q \in \mathcal{G}_{m,n}$ then there exists $X \in T_P\mathcal{G}_n$ such that $Q = e^{iX}Pe^{-iX}$ and $\|X\| \le \frac{\pi}{2}$; unique when $\|P - Q\| < 1$. The geodesic $\gamma(t) = e^{itX}Pe^{-itX}$ is shorter than any rectifiable path in $\mathcal{G}_n$ joining $P$, $Q$ and

\[ d_\phi(P, Q) = \|XP - PX\|_\phi = \|X\|_\phi. \]

If the norm is strictly convex and $\|P - Q\| < 1$, the geodesic is the unique short path joining $P, Q \in \mathcal{G}_n$.

Proof. The existence of $X$ follows from Halmos [8] or Davis and Kahan [6]. Since $e^{2iX} = S_QS_P$, if $\|Q - P\| < 1$ this $X$ is unique. Since

\[ S_{\gamma(t)} = 2\gamma(t) - 1 = e^{itX}S_Pe^{-itX} = e^{2itX}S_P = S_Pe^{-2itX}, \]

and $S$ is two times an isometry, the minimality of $\gamma$ follows from Theorem 4.3 and the same applies to the uniqueness in the strictly convex case. Finally, $L_\phi(\gamma) = \|XP - PX\|_\phi$, and on the other hand, since $PXP = 0$ then

\[ |XP - PX|^2 = |XP + PX|^2 = |X|^2, \]

thus $d_\phi(P, Q) = L_\phi(\gamma) = |||XP - PX|||_\phi = |||\,|X|\,|||_\phi = \|X\|_\phi$. ■

Remark 4.8. In the situation of the previous theorem, it is not hard to see that if $k$ is a non-negative integer, then $PX^{2k} = X^{2k}P$ and $PX^{2k+1} = X^{2k+1}(1 - P)$. Then $P|X| = |X|P = |XP|$ and $(1 - P)|X| = |X|(1 - P) = |PX|$. Moreover

\[ Q = P\cos^2 X + (1 - P)\sin^2 X - \frac{i}{2}P\sin 2X + \frac{i}{2}(1 - P)\sin 2X, \]

and then $|PQ|^2 = PQP = P\cos^2 X$, which leads to $|PQ| = P\cos X = \cos|XP|$, and likewise $|QP| = (1 - P)\cos X = \cos|PX|$. Thus if $Y \in T_P\mathcal{G}_n$ is any other matrix as $X$, it follows that $P\cos X = P\cos Y$ or equivalently,

\[ \cos|XP| = |PQ| = \cos|YP|. \]

4.4 The angular metrics 

Let $\mathcal{X}$ and $\mathcal{Y}$ be two $m$-dimensional subspaces of $\mathbb{C}^n$, and let $P_\mathcal{X}$ and $P_\mathcal{Y}$ be the orthogonal projections onto $\mathcal{X}$ and $\mathcal{Y}$ respectively. The principal angles between $\mathcal{X}$ and $\mathcal{Y}$ are the angles $\theta_1(\mathcal{X},\mathcal{Y}), \dots, \theta_m(\mathcal{X},\mathcal{Y}) \in [0, \pi/2]$ whose cosines are the $m$ greatest singular values of $P_\mathcal{X}P_\mathcal{Y}$, see [9].

In [10] Li, Qiu, and Zhang used the principal angles to define metrics in the components $\mathcal{G}_{m,n}$. Given a symmetric norm $\|\cdot\|_\phi$, they define for $P, Q \in \mathcal{G}_{m,n}$ the following distance:

\[ \rho_\phi(P, Q) = \|\arccos|PQ|\,\|_\phi. \]

These distances are called angular metrics, because if $\phi$ is the symmetric gauge function associated to $\|\cdot\|_\phi$ then

\[ \rho_\phi(P, Q) = \phi(\theta_1(\mathcal{X},\mathcal{Y}), \dots, \theta_m(\mathcal{X},\mathcal{Y}), 0, \dots, 0), \]

where $\mathcal{X} = R(P)$ and $\mathcal{Y} = R(Q)$. The definition of these metrics was motivated not only by pure mathematics but also by engineering applications. For example, in robust control, a linear time-invariant system can be described by a subspace valued frequency function, and the description of an uncertain system needs a suitable distance measure between subspaces. The reader is referred to [10], where other motivations and applications of these metrics are described.
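
The principal angles, and with them the angular metric, are straightforward to compute from a singular value decomposition. The sketch below is illustrative only: it assumes the $\ell^p$ gauge functions for $\phi$, builds projections from spanning vectors via a QR factorization, and uses hypothetical helper names.

```python
import numpy as np

def projection(basis):
    """Orthogonal projection onto the column span of `basis` (full column rank)."""
    Q, _ = np.linalg.qr(basis)
    return Q @ Q.conj().T

def principal_angles(P, Q, m):
    """Principal angles between the ranges of two rank-m projections, from the
    m largest singular values of P Q."""
    s = np.linalg.svd(P @ Q, compute_uv=False)[:m]  # singular values, largest first
    return np.arccos(np.clip(s, 0.0, 1.0))

def angular_metric(P, Q, m, p=2.0):
    """rho_phi(P, Q) = phi(theta_1, ..., theta_m, 0, ..., 0) for the l^p gauge."""
    theta = principal_angles(P, Q, m)
    return float(np.sum(theta ** p) ** (1.0 / p))

# Two lines in the plane meeting at an angle of 0.3 radians.
P = projection(np.array([[1.0], [0.0]]))
Q = projection(np.array([[np.cos(0.3)], [np.sin(0.3)]]))
rho = angular_metric(P, Q, m=1)
```

For these two lines the single principal angle is the angle between them, so `rho` recovers 0.3 up to rounding.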

A legitimate question at this point is whether these distances are related to an infinitesimal structure on the manifold $\mathcal{G}_n$, that is, whether the angular distance between $P, Q \in \mathcal{G}_{m,n}$ can be computed as the infimum of the lengths of the rectifiable arcs joining $P$, $Q$. Note that, by Remark 4.8, if $X$ is as in Theorem 4.7 then the angular distance between $P$, $Q$ can be computed as

\[ \rho_\phi(P, Q) = \|\arccos|PQ|\,\|_\phi = \|XP\|_\phi \]

and this computation does not depend on the particular $X$. Then, one can be tempted to endow the Grassmannian with the Finsler metric (i.e. tangent norm) given by $\|X\|_P = \|XP\|_\phi$ for $X \in T_P\mathcal{G}_n$. The problem with this definition is that it is not clear how to extend it to the whole $\mathcal{M}_n(\mathbb{C})$ in order to obtain a unitarily invariant norm there.

To this end, it suffices to consider the case $m \le n/2$. Let $\phi$ be the symmetric gauge function associated to $\|\cdot\|_\phi$ (see Theorem 4.2), and define $\|\cdot\|_\psi$ in the following way:

\[ \|A\|_\psi = \phi\big(\tfrac{1}{2}(s_1(A) + s_2(A), \dots, s_{2m-1}(A) + s_{2m}(A), 0, \dots, 0)\big), \tag{8} \]

where $s_1(A), \dots, s_n(A)$ denotes the singular values of $A$ counted with multiplicity and ordered in non-increasing way. Straightforward computations show that $\|\cdot\|_\psi$ is a symmetric norm, and also that, for any $Q \in \mathcal{G}_{m,n}$ and $Z \in T_Q\mathcal{G}_n$ it holds

\[ \|QZ\|_\phi = \|Z\|_\psi. \]


The following theorem gives the link between the rectifiable distances and the angular 
metrics: 

Theorem 4.9 (Davis-Kahan [6j). Let P,Q G Gm,n, and denote X = R{P) and y = R{Q). 
Then, if X ^ ^{n) is P-codiagonal with \\X\\ < 7r/2 and Q = e^-^ Pe~^-^ , its spectrum 
counted with multiplicity is 

Consider the rectifiable distance d^ associated to the norm given in ([8]), and take P,Q,X 
as in Theorem 14.71 Then 

d^{P, Q) = \\X\\^ = 0(1/2(S1 {X) + S2 (X) , . . . , S2ra-1 {X) + S2ra (^) , 0, . . . , 0)) 

= (t>{ei{x,y),...,9^{x,y),Q...,Q) 

= P4>{P,Q), 

by Theorem l4.9l and this establishes the following (obtained by Neretin in [12] with another 
proof): 

Theorem 4.10. Let $\|\cdot\|_\phi$ be a symmetric norm, and $\rho_\phi$ its corresponding angular metric 
in $G_{m,n}$. Then, there exists an induced symmetric norm $\|\cdot\|_\psi$ such that the corresponding 
rectifiable distance $d_\psi$ coincides with $\rho_\phi$. 
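The coincidence of the two distances rests on the fact that the paired singular values of $X$ are the principal angles between the ranges of $P$ and $Q$. The sketch below illustrates this numerically under stated assumptions: $P$ is the projection onto the first $m$ coordinates, the principal angles are computed as the arccosines of the $m$ largest singular values of $PQ$, and the helper `expm_i_hermitian` is a hypothetical name introduced only for this example.

```python
import numpy as np

def expm_i_hermitian(X):
    """Return exp(iX) for a Hermitian matrix X via its eigen-decomposition."""
    lam, V = np.linalg.eigh(X)
    return (V * np.exp(1j * lam)) @ V.conj().T

rng = np.random.default_rng(1)
n, m = 6, 2                                          # requires m <= n/2
P = np.diag([1.0] * m + [0.0] * (n - m)).astype(complex)

# Hermitian P-codiagonal X with spectral norm 1.2, below pi/2
B = rng.standard_normal((m, n - m)) + 1j * rng.standard_normal((m, n - m))
B *= 1.2 / np.linalg.norm(B, 2)
X = np.zeros((n, n), dtype=complex)
X[:m, m:] = B
X[m:, :m] = B.conj().T

U = expm_i_hermitian(X)
Q = U @ P @ U.conj().T

# Principal angles between R(P) and R(Q), in non-increasing order
cosines = np.linalg.svd(P @ Q, compute_uv=False)[:m]
angles = np.sort(np.arccos(np.clip(cosines, -1.0, 1.0)))[::-1]

# Averages of consecutive pairs of singular values of X, as in (8)
s = np.linalg.svd(X, compute_uv=False)
paired = 0.5 * (s[0:2 * m:2] + s[1:2 * m:2])

print(np.max(np.abs(angles - paired)))
```

Since both vectors agree, applying any symmetric gauge function to them gives the same value, which is the content of the chain of equalities above in this finite-dimensional example.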

Remark 4.11. In [10, Section 4], the authors prove that when the norm $\|\cdot\|_\phi$ is strictly 
convex, if the distance among $P, Q, R \in G_{m,n}$ is additive, then there exists a direct rotation 
from $\mathcal{X}$ to $\mathcal{Z}$ through $\mathcal{Y}$, where $\mathcal{X} = R(P)$, $\mathcal{Y} = R(Q)$ and $\mathcal{Z} = R(R)$. This last assertion 
is equivalent to the notion of being aligned as introduced in Remark 4.4. Thus the proof 
of this fact follows immediately from Theorem 4.5. 



The arithmetic mean can be replaced by any positive mean. 
A Appendix: compact operators 

The results of the previous sections can be extended to the infinite dimensional setting 
as follows. Let $\mathcal{H}$ be a complex separable Hilbert space, $B(\mathcal{H})$ the algebra of bounded 
operators with the supremum norm, $\mathcal{K}(\mathcal{H})$ the algebra of compact operators, $\mathcal{U}(\mathcal{H})$ the 
group of unitary operators. Let $\|\cdot\|_\phi : B(\mathcal{H}) \to \mathbb{R} \cup \{\infty\}$ be a symmetric norm, that is, a 
norm such that 

$$\|AXB\|_\phi \le \|A\|\,\|X\|_\phi\,\|B\| \qquad (9)$$

for $A, X, B \in B(\mathcal{H})$ (both sides can equal $\infty$). In particular, it is unitarily invariant, 
thus it only depends on the singular values of the operator, and as in Theorem 3.2, there 
is a symmetric gauge function $\phi : \mathbb{R}^\infty \to \mathbb{R}_{\ge 0}$ related to this norm; the relationship 
is somewhat subtle, so we refer the reader to Simon's book [14] for full details on these 
symmetrically normed ideals. 

Let $\mathcal{I} \subset \mathcal{K}(\mathcal{H})$ stand for the ideal of operators with finite norm, which will be assumed 
to be complete with respect to its norm, and let $\mathcal{U}_\phi = \{u \in \mathcal{U}(\mathcal{H}) : u - 1 \in \mathcal{I}\}$. This 
is a Banach-Lie group, whose Banach-Lie algebra can be readily identified with the 
anti-Hermitian part of $\mathcal{I}$, which we will denote by $i\mathcal{I}_h$. A straightforward computation using 
the functional calculus and the fact that $\mathcal{I}$ is an ideal shows that if $Z$ is self-adjoint with $\|Z\| \le \pi$ 
and $e^{iZ} = U \in \mathcal{U}_\phi$, then $Z \in \mathcal{I}$. 

A.1 The special unitary groups 

The length functional on $\mathcal{U}_\phi$ is defined accordingly as $L_\phi(\alpha) = \int_0^1 \|\dot{\alpha}\|_\phi$, and the distance 
$d_\phi$ is defined as the infimum of the lengths of curves in $\mathcal{U}_\phi$ joining given endpoints. In order to 
prove minimality of geodesic segments, we will need the following extension of Thompson's 
formula, whose proof can be found in [2, Theorem 3.2]: 

Theorem A.1. Given $X, Y \in \mathcal{K}(\mathcal{H})_h$, there is an isometry $w \in B(\mathcal{H})$ ($w^*w = 1$), and 
unitary operators $U$ and $V$ such that 

$$e^{iwXw^*} e^{iwYw^*} = e^{iU(wXw^*)U^* + iV(wYw^*)V^*}.$$

Theorem A.2. Let $U, V \in \mathcal{U}_\phi$, $Z \in \mathcal{I}$ such that $V = Ue^{iZ}$ and $\|Z\| \le \pi$. Then, the curve 
$\gamma(t) = Ue^{itZ}$ is minimal among rectifiable curves $\alpha \subset \mathcal{U}_\phi$ joining $U, V$, with respect to the 
distance induced by the length $L_\phi$, and $d_\phi(U,V) = \|Z\|_\phi$. This curve is unique if the norm 
is strictly convex and $\|U - V\| < 2$ (equivalently, $\|Z\| < \pi$). 

Proof. If $Z \in \mathcal{I}$ is such that $e^{iZ} = e^{iX} e^{iY}$ and $\|Z\| \le \pi$ (where we can assume that 
$X, Y \in \mathcal{I}$), then $e^{iwZw^*} = e^{iwXw^*} e^{iwYw^*}$ for some isometry $w \in B(\mathcal{H})$ by Theorem A.1. 
With the same proof as Corollary 2.2, we obtain 

$$|wZw^*| \le |iU(wXw^*)U^* + iV(wYw^*)V^*|.$$
Due to (9), it follows that 

$$\|Z\|_\phi = \|w^* wZw^* w\|_\phi \le \|wZw^*\|_\phi \le \|X\|_\phi + \|Y\|_\phi,$$

since $w$ is an isometry and thus $\|w\| = 1$. Now the rest of the proof of minimality of segments 
follows as in Section 3. The uniqueness when the norm is strictly convex can be proved 
invoking Theorem A.1 and arguing as in the proof of Theorem 3.10. ■ 
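The key inequality of the proof, $\|Z\|_\phi \le \|X\|_\phi + \|Y\|_\phi$ whenever $e^{iZ} = e^{iX}e^{iY}$ and $\|Z\| \le \pi$, can be illustrated in finite dimensions. The sketch below is not the infinite-dimensional argument: it assumes $n \times n$ matrices, uses the trace norm as one particular symmetric norm, and reads the spectrum of $Z$ from the arguments of the eigenvalues of the unitary $e^{iX}e^{iY}$. All helper names are hypothetical.

```python
import numpy as np

def expm_i_hermitian(X):
    """Return exp(iX) for a Hermitian matrix X via its eigen-decomposition."""
    lam, V = np.linalg.eigh(X)
    return (V * np.exp(1j * lam)) @ V.conj().T

def random_hermitian(rng, n, scale):
    """Random Hermitian matrix rescaled to have spectral norm equal to scale."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = (A + A.conj().T) / 2
    return scale * H / np.linalg.norm(H, 2)

def trace_norm(A):
    """Trace norm: the sum of the singular values (an assumed choice of symmetric norm)."""
    return float(np.linalg.svd(A, compute_uv=False).sum())

rng = np.random.default_rng(2)
n = 5
X = random_hermitian(rng, n, 1.0)
Y = random_hermitian(rng, n, 0.8)

# W = exp(iX) exp(iY) is unitary; Z is its logarithm with spectrum in (-pi, pi]
W = expm_i_hermitian(X) @ expm_i_hermitian(Y)
z = np.angle(np.linalg.eigvals(W))

geodesic_length = float(np.abs(z).sum())        # trace norm of Z
broken_length = trace_norm(X) + trace_norm(Y)   # length of the two-segment path
print(geodesic_length, broken_length)
```

Here `broken_length` is the length of the path that first follows $t \mapsto e^{itX}$ and then $t \mapsto e^{iX}e^{itY}$, so the comparison shows the one-parameter segment being no longer than this particular competitor.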

A.2 The restricted Grassmannians 

The same considerations hold for the special Grassmannian manifold, whose components 
can be regarded as unitary orbits of self-adjoint projections $P \in B(\mathcal{H})$, with the action of 
these special unitary groups: 

$$\mathcal{G}_\phi(P) = \{UPU^* : U \in \mathcal{U}_\phi\}.$$

Since $U - 1 \in \mathcal{I}$, the orbit is contained in the affine space $P + \mathcal{I}$. Then tangent spaces 
are identified with 

$$T_P\mathcal{G}_\phi(P) = \{X \in \mathcal{I}_h : XP + PX = X\}.$$

A well-known result of Halmos [8] says that if $P, Q \in B(\mathcal{H})$ are self-adjoint projections 
whose ranges have the same dimension (including the possibility of $+\infty$), and the same 
holds for their kernels, then there exists a $P$-codiagonal $X$ such that $\|X\| \le \pi/2$ and 
$Q = e^{iX} P e^{-iX}$. Since $\mathcal{G}_\phi(P) \subset P + \mathcal{I}$, it is easy to check that $S_Q S_P \in \mathcal{U}_\phi$. Then, $e^{2iX} = S_Q S_P$ 
is also in $\mathcal{U}_\phi$, and it follows that $X \in \mathcal{I}$. 
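The identity $e^{2iX} = S_Q S_P$ used above can be checked in a finite-dimensional example. This is a sketch under the assumption that $S_P = 2P - 1$ denotes the symmetry associated to the projection $P$ (the notation is not defined in this section), with $P$ the projection onto the first $m$ coordinates; `expm_i_hermitian` is a hypothetical helper.

```python
import numpy as np

def expm_i_hermitian(X):
    """Return exp(iX) for a Hermitian matrix X via its eigen-decomposition."""
    lam, V = np.linalg.eigh(X)
    return (V * np.exp(1j * lam)) @ V.conj().T

rng = np.random.default_rng(3)
n, m = 5, 2
I = np.eye(n, dtype=complex)
P = np.diag([1.0] * m + [0.0] * (n - m)).astype(complex)

# Hermitian P-codiagonal X with spectral norm 1.0, below pi/2
B = rng.standard_normal((m, n - m)) + 1j * rng.standard_normal((m, n - m))
B *= 1.0 / np.linalg.norm(B, 2)
X = np.zeros((n, n), dtype=complex)
X[:m, m:] = B
X[m:, :m] = B.conj().T

U = expm_i_hermitian(X)
Q = U @ P @ U.conj().T

S_P = 2 * P - I            # assumed definition of the symmetry of P
S_Q = 2 * Q - I

lhs = S_Q @ S_P
rhs = expm_i_hermitian(2 * X)
print(np.max(np.abs(lhs - rhs)))
```

The identity follows because a $P$-codiagonal $X$ anticommutes with $S_P$, so that $S_Q = e^{iX} S_P e^{-iX} = e^{2iX} S_P$.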

Corollary A.3. If $P, Q \in \mathcal{G}_\phi(P)$ then there exists $X \in T_P\mathcal{G}_\phi(P)$ such that $Q = e^{iX} P e^{-iX}$ 
and $\|X\| \le \pi/2$, unique when $\|P - Q\| < 1$. The geodesic $\gamma(t) = e^{itX} P e^{-itX}$ is shorter than 
any rectifiable path in $\mathcal{G}_\phi(P)$ joining $P, Q$ and $d_\phi(P,Q) = \|XP - PX\|_\phi = \|X\|_\phi$. If the 
norm is strictly convex and $\|P - Q\| < 1$, the geodesic is the unique short path joining 
$P, Q \in \mathcal{G}_\phi(P)$. 

Remark A.4. When $\mathcal{I}$ is the ideal of Hilbert-Schmidt operators, the special Grassmannian 
defined above is known as the Sato Grassmannian or the restricted Grassmannian. 
The proof of minimality of one-parameter groups in this Riemann-Hilbert setting was 
given in [1] with a different technique. 